On the role of old information in generating readable text: a psychological and computational definition of 'old' and 'new' information in the NOSVO system

Author

Mark Vincent LaPolla

Abstract

There are at least two stages of text generation. One is generating the content of the text. The other is generating the language that represents and communicates the content (Thompson 1977). These two stages, though interrelated, have their own sets of interesting problems and principles. The first stage, generating the semantic content of the text, involves motivating, planning and creating the conceptual and semantic content of a piece of text. Once the semantic representation for a text has been constructed, the language of that text can be generated. The second stage, language generation, involves communicating the intent and content of the text without confusing or misleading the reader. This paper will address the second stage only. It is not enough to merely generate text. It is also necessary to generate cohesive text. However, a shopping list is cohesive, though not "flowing" text by any means. A set of sentences that are propositionally related is cohesive, though not necessarily beautiful prose. It is not enough to attend to ellipsis and pronominalization to generate readable prose. We believe that there are other factors which must be attended to in order to generate prose. The NOSVO system is an attempt to take into account old/new information contrasts (Chafe 1974, 1976), which we believe will help natural language generation systems produce more readable text.

[...] 'assumable' as being there" (Prince 1978:819). This is quite important and expands upon Chafe, since for him the important thing is that the antecedent must be in the hearer's consciousness, i.e. in the hearer's focus of attention, while for Prince and LaPolla it need only be appropriate to the situation, or in some other way cooperatively assumable to be in the hearer's consciousness.
Hajičová and Vrbová (1981) also take exception to the terms "given (or old)" and "new" information and suggest that "contextually bound" and "contextually non-bound" lexical items would be more appropriate. "Contextually bound" and "contextually non-bound" are even more appropriate than "already activated" and "newly activated" because they seem to also convey situational appropriateness. However, it seems that Hajičová restricts her terminology, as well as her theory of discourse (focus) structure, to linguistic antecedents. That is, her "shared stock of knowledge" appears to be close to, if not completely, linguistic in representation. Therefore, neither her theory nor her terminology has the power to deal with an antecedent that is merely inferable or appropriate to a situation. We will use the familiar terms "new/old information" but will define them a little more precisely later in the paper.

Similar resources

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

A Comparative Study of Clinical Information Management of War-Related Mental Illnesses in Selected Countries and Iran

Background and Aim: Today, psychological diseases like so many diseases, have an old history. Clinical Information System of psychological diseases resulting from war is a part of the information management system of mental illnesses, due to the management of mental patients from the war. This study is aimed to compare information management of psychological diseases in American, Australia and ...

Teaching Information Literacy to Iranian Children Aged 7 to 11

Purpose: To develop instructional objectives for implementing an information literacy instruction program for Iranian children (7-11 years old) based on the information literacy standards of the American Association of School Librarians (AASL). Methodology: In this research, the following methods were used: a literature review in order to extract the instructional objectives of information literacy b...

A Model for Extracting Information from Textual Documents, Based on Text Mining, in the Field of E-Learning

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...

A Sociological Definition and Categorization of Information Ethics

Background and Aim: This paper aims at the analysis of the definitions and categorizations of the realm of “Information Ethics” to criticize assumptions and clarify points of departure for introducing a new definition and categorization. Method: I used documentary research method and conceptual analysis approach. This method and approach is the best fits with the goal of pursuit roots of social...

Publication date: 1988
